Rank | Count | Beginning |
---|---|---|
64647 | 255 | 本报讯 |
64784 | 98 | 本报讯(记者 |
98159 | 60 | 青海新闻网讯 |
57764 | 31 | 据新华社电 |
72903 | 22 | 玉溪新闻网讯(记者 |
64800 | 21 | 本报讯(记者 |
70697 | 21 | 泰顺新闻网讯(记者 |
33722 | 19 | 商报讯 |
62452 | 18 | 晚报讯 |
40833 | 16 | 太原新闻网讯 |
64633 | 11 | 本报综合消息 |
65105 | 11 | 本报讯(首席记者 |
71258 | 11 | 深圳新闻网讯 |
64692 | 10 | 本报讯(实习记者 |
64795 | 9 | 本报讯 (记者 |
62549 | 8 | 晨报讯 |
65123 | 8 | (本报记者 |
72893 | 8 | 玉溪新闻网讯 |
62521 | 7 | 晚报记者 |
65162 | 7 | 本报长沙讯 |
2787 | 6 | 2 |
59702 | 6 | 新华社电 |
4562 | 5 | 6. |
8234 | 5 | 上一篇 |
55265 | 5 | 扬子晚报网消息 |
70150 | 5 | 沈阳晚报讯(记者 |
70690 | 5 | 泰顺新闻网讯 |
78669 | 5 | 第三条 |
78975 | 5 | 第八条 |
79001 | 5 | 第六条 |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, a small corpus is enough to identify the most frequent sentence beginnings.
-- Frequency of the first space-delimited token (N = 1) of each sentence
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt
from sentences
group by substring_index(sentence, ' ', 1)
order by cnt desc
limit 50;
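The same counting can be sketched outside the database, which may help when trying other values of N. The following Python sketch is an illustration, not part of the corpus tooling. It assumes the sentences are already tokenized with single spaces, as the query implies. The function name `sentence_beginnings` and the sample sentences are hypothetical.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n tokens.

    Mirrors substring_index(sentence, ' ', n): the text before the n-th
    space, or the whole sentence if it contains fewer than n spaces.
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    return counts.most_common(limit)


# Hypothetical, pre-tokenized toy sentences (not taken from the corpus).
sample = [
    "本报讯 记者 张三 报道",
    "本报讯 记者 李四 报道",
    "据 新华社 电",
]

print(sentence_beginnings(sample, n=1))  # [('本报讯', 2), ('据', 1)]
print(sentence_beginnings(sample, n=2))  # [('本报讯 记者', 2), ('据 新华社', 1)]
```

Changing `n` to 2, 3, or 4 corresponds to changing the third argument of `substring_index` in the query.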
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV